IN THE CLAIMS:

This listing of the claims replaces all prior versions and 
listings of the claims. 

Claim 1. (currently amended) A method for generating 
speech recognition models, the method comprising: 

converting speech spoken from a plurality of female 
speakers into a first set of recorded phonemes training data; 

converting speech spoken from a plurality of male speakers 
into a second set of recorded phonemes training data; 

receiving a first speech recognition model based on the 
first set of recorded phonemes training data; 

receiving a second speech recognition model based on the 
second set of recorded phonemes training data; 

determining a difference in model information between the 
first speech recognition model and the second speech recognition 
model; and 

creating a gender-independent speech recognition model 
based on the first set of recorded phonemes training data and 
the second set of recorded phonemes training data if the 
difference in model information is insignificant. 

Claim 2. (original) The method of claim 1, wherein whether 
the model information is insignificant is based on a threshold 
model quantity. 

Claim 3. (previously presented) The method of claim 1, 
wherein determining the difference in model information includes 
calculating a Kullback Leibler distance between the first speech 
recognition model and second speech recognition model. 

Claim 4. (original) The method of claim 3, wherein whether 
the model information is insignificant is based on a threshold 
Kullback Leibler distance quantity. 

Claim 5. (previously presented) The method of claim 1, 
wherein the first speech recognition model, second speech 
recognition model, and gender-independent speech recognition 
model are Gaussian mixture models. 
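
The comparison recited in claims 1 through 5 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes each model is reduced to a single diagonal-covariance Gaussian given as a (means, variances) pair, because the Kullback Leibler distance between full Gaussian mixture models has no closed form. The function names, the symmetric form of the distance, and the threshold values are all hypothetical choices made for illustration.

```python
import math


def kl_diag_gaussian(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians given as lists."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(model_a, model_b):
    """Sum of the two directed divergences, so the result is order-independent."""
    return kl_diag_gaussian(*model_a, *model_b) + kl_diag_gaussian(*model_b, *model_a)


def should_pool(model_a, model_b, threshold):
    """True when the difference is treated as insignificant (below the threshold)."""
    return symmetric_kl(model_a, model_b) < threshold


# Hypothetical one-dimensional models: (means, variances).
female_model = ([0.0], [1.0])
male_model = ([1.0], [1.0])

# With an assumed threshold of 2.0 the two training sets would be pooled
# into one independent model; with 0.5 they would stay separate.
pool_loose = should_pool(female_model, male_model, threshold=2.0)
pool_strict = should_pool(female_model, male_model, threshold=0.5)
```

In this sketch a larger threshold merges more model pairs into an independent model, and a smaller threshold keeps more of them separate.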

Claim 6. (currently amended) A system for generating 
speech recognition models, the system comprising: 

a computer processor; 

a first speech recognition model based on a first set of 
training data, the first set of training data originating from a 
first set of common entities; 

a second speech recognition model based on a second set of 
training data, the second set of training data originating from 
a second set of common entities; and 

a processing module configured to create an independent 
speech recognition model based on the first set of training data 
and the second set of training data if the difference in model 
information between first speech recognition model and the 
second speech recognition model is insignificant. 

Claim 7. (original) The system of claim 6, wherein whether 
the model information is insignificant is based on a threshold 
model quantity. 

Claim 8. (previously presented) The system of claim 6, 
wherein the processing model is further configured to calculate 
a Kullback Leibler distance between the first speech recognition 
model and second speech recognition model. 

Claim 9. (original) The system of claim 8, wherein whether 
the model information is insignificant is based on a threshold 
Kullback Leibler distance quantity. 

Claim 10. (currently amended) The system of claim 
6, wherein the first speech recognition model, second speech 
recognition model, and independent speech recognition model are 
Gaussian mixture models. 

Claim 11. (previously presented) A computer program 
product embodied in computer memory comprising: 

computer readable program codes coupled to the computer 
memory for generating speech recognition models, the computer 
readable program codes configured to cause the program to: 

receive a first speech recognition model based on a first 
set of training data, the first set of training data originating 
from a first set of common entities; 

receive a second speech recognition model based on a second 
set of training data, the second set of training data 
originating from a second set of common entities; 

determine a difference in model information between the 
first speech recognition model and the second speech recognition 
model; and 

create an independent speech recognition model based on the 
first set of training data and the second set of training data 
if the difference in model information is insignificant. 

Claim 12. (original) The computer program product of claim 
11, wherein whether the model information is insignificant is 
based on a threshold model quantity. 

Claim 13. (original) The computer program product of claim 
11, wherein determining the difference in model information 
includes calculating a Kullback Leibler distance between the 
first model and second model. 

Claim 14. (original) The computer program product of claim 
13, wherein whether the model information is insignificant is 
based on a threshold Kullback Leibler distance quantity. 

Claim 15. (previously presented) The computer program 
product of claim 11, wherein the first speech recognition model, 
second speech recognition model, and independent speech 
recognition model are Gaussian mixture models. 

Claim 16. (currently amended) A system for generating 
speech recognition models, the method comprising: 

a computer processor; 

a first speech recognition model based on a first set of 
training data, the first set of training data originating from a 
first set of common entities; 

a second speech recognition model based on a second set of 
training data, the second set of training data originating from 
a second set of common entities; and 

means for creating an independent speech recognition model 
based on the first set of training data and the second set of 
training data if the difference in model information between 
first speech recognition model and the second speech recognition 
model is insignificant. 

Claim 17. (currently amended) A method for recognizing 
speech from an audio stream originating from one of a plurality 
of data classes, the method comprising: 

converting the speech into the audio stream; 

receiving a current feature vector of the audio stream; 

computing a current vector probability that the current 
feature vector belongs to one of the plurality of data classes; 

computing an accumulated confidence level that the audio 
stream belongs to one of the plurality of data classes based on 
the current vector probability and on previous vector 
probabilities; 

weighing class models based on the accumulated confidence; 

recognizing the current feature vector based on the 
weighted class models; and 

wherein the plurality of data classes include a female 
speech recognition model based on recorded phonemes originating 
from plurality of female speakers, a male speech recognition 
model based on recorded phonemes originating from plurality of 
male speakers, and a gender-independent speech recognition model 
based on recorded phonemes originating from plurality of both 
female and male speakers having insignificant differences in 
information. 

Claim 18. (original) The method of claim 17, wherein 
computing the current vector probability includes estimating an 
a posteriori class probability for the current feature vector. 

Claim 19. (original) The method of claim 17, wherein 
computing the accumulated confidence level further comprising 
weighing the current vector probability more than the previous 
vector probabilities. 

Claim 20. (original) The method of claim 17, further 
comprising determining if another feature vector is available 
for analysis. 
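
The steps recited in claims 17 through 20 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that per-class likelihoods for each feature vector are already available, that class priors are equal, and that the accumulated confidence is an exponentially smoothed average. The smoothing factor `alpha` and all function names are hypothetical.

```python
def class_posteriors(likelihoods, priors=None):
    """A posteriori class probabilities for one feature vector (Bayes' rule)."""
    n = len(likelihoods)
    if priors is None:
        priors = [1.0 / n] * n  # assumption: equal class priors
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]


def update_confidence(accumulated, current, alpha=0.6):
    """Blend the current vector probability into the accumulated confidence.

    With alpha > 0.5 the current vector counts for more than all
    previous vectors combined.
    """
    return [alpha * c + (1.0 - alpha) * a for c, a in zip(current, accumulated)]


def weighted_score(per_class_scores, confidence):
    """Combine per-class model scores, weighing each class by its confidence."""
    return sum(w * s for w, s in zip(confidence, per_class_scores))


def accumulate_over_stream(likelihood_stream, alpha=0.6):
    """Process feature vectors for as long as another one is available."""
    confidence = None
    for likelihoods in likelihood_stream:
        current = class_posteriors(likelihoods)
        if confidence is None:
            confidence = current
        else:
            confidence = update_confidence(confidence, current, alpha)
    return confidence


# Two hypothetical feature vectors scored against two class models.
final_confidence = accumulate_over_stream([[1.0, 1.0], [1.0, 3.0]])
```

Here the first vector is ambiguous between the two classes and the second favors the second class, so the accumulated confidence shifts toward the second class model without discarding the earlier evidence.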

Claim 21. (currently amended) A system for recognizing 
speech data from an audio stream originating from one of a 
plurality of data classes, the system comprising: 

a computer processor; 

a receiving module configured to receive a current feature 
vector of the audio stream; 

a first computing module configured to compute a current 
vector probability that the current feature vector belongs to 
one of the plurality of data classes; 

a second computing module configured to compute an 
accumulated confidence level that the audio stream belongs to 
one of the plurality of data classes based on the current vector 
probability and on previous vector probabilities; 

a weighing module configured to weigh class models based on 
the accumulated confidence; and 

a recognizing module configured to recognize the current 
feature vector based on the weighted class models; and 

wherein the plurality of data classes include a first 
speech recognition model based on recorded phonemes originating 
from a first set of speakers, a second speech recognition model 
based on recorded phonemes from a second set of speakers, and a 
third speech recognition model based on recorded phonemes 
originating from both the first and second set of speakers 
having insignificant differences in information. 

Claim 22. (original) The system of claim 21, wherein the 
first computing module is further configured to estimate an a 
posteriori class probability for the current feature vector. 

Serial No: 10/649,909 

Claim 23. (original) The system of claim 21, wherein the 
second computing module is further configured to weigh the 
current vector probability more than the previous vector 
probabilities. 

Claim 24. (previously presented) A computer program 
product embodied in computer memory comprising: 

computer readable program codes coupled to the computer 
memory for recognizing speech data from an audio stream 
originating from one of a plurality of data classes, the 
computer readable program codes configured to cause the program 
to: 

receive a current feature vector of the audio stream; 

compute a current vector probability that the current 
feature vector belongs to one of the plurality of data classes; 

compute an accumulated confidence level that the audio 
stream belongs to one of the plurality of data classes based on 
the current vector probability and on previous vector 
probabilities; 

weigh class models based on the accumulated confidence; and 

recognize the current feature vector based on the weighted 
class models; and 

wherein the plurality of data classes include a first 
speech recognition model based on recorded phonemes originating 
from a first set of speakers, a second speech recognition model 
based on recorded phonemes from a second set of speakers, and a 
third speech recognition model based on recorded phonemes 
originating from both the first and second set of speakers 
having insignificant differences in information. 

Claim 25. (original) The computer program product of claim 
24, wherein the program code configured to compute the current 
vector probability includes program code configured to determine 
an a posteriori class probability for the current feature 
vector. 

Claim 26. (original) The computer program product of claim 
24, wherein the program code configured to compute the 
accumulated confidence level includes program code configured to 
weigh the current vector probability more than the previous 
vector probabilities. 

Claim 27. (original) The computer program product of claim 
24, further comprising program code configured to determine if 
another feature vector is available for analysis. 